Apparatus and method for executing system commands based on captured image data

ABSTRACT

An apparatus and method are provided for identifying and executing system commands based on captured image data. In one implementation, a method is provided for executing at least one command retrieved from a captured image. According to the method, image data is received from an image sensor, and the image data may include printed information associated with a specific system command. The method further includes accessing a database including a plurality of predefined system commands associated with printed information, and identifying in the image data an existence of the printed information associated with the specific system command stored in the database. The specific system command is executed after the printed information associated with the specific system command is identified.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of priority of U.S. Provisional Patent Application No. 61/775,603, filed Mar. 10, 2013, U.S. Provisional Patent Application No. 61/799,649, filed on Mar. 15, 2013, and U.S. Provisional Patent Application No. 61/830,122, filed on Jun. 2, 2013, the disclosures of which are incorporated herein by reference in their entirety.

BACKGROUND

I. Technical Field

This disclosure generally relates to devices and methods for providing information to a user. More particularly, this disclosure relates to devices and methods for providing information to a user by processing images captured from the environment of the user.

II. Background Information

Visual acuity is an indication of the clarity or clearness of a person's vision that is commonly measured at a distance of twenty feet from an object. When measuring visual acuity, the ability of a person to identify black symbols on a white background at twenty feet is compared to the ability of a person with normal eyesight. This comparison can be symbolized by a ratio. For example, a ratio of 20/70 vision means a person located at a distance of twenty feet can see what a person with normal vision can see at seventy feet. A person has low vision if he or she has a visual acuity between 20/70 and 20/200 in the better-seeing eye that cannot be corrected or improved with regular eyeglasses. The prevalence of low vision is about one in a hundred for people in their sixties and rapidly increases to one in five for people in their nineties. Low vision may also depend on the environment. For example, some individuals may be able to see only when there is ample light.

A person may have low vision (also known as visual impairment) for several reasons. Other than eye damage and failure of the brain to receive visual cues sent by the eyes, different medical conditions may cause visual impairment. Medical conditions that may cause visual impairment include Age-related Macular Degeneration (AMD), retinitis pigmentosa, cataract, and diabetic retinopathy.

AMD, which usually affects adults, is caused by damage to the retina that diminishes vision in the center of a person's visual field. The lifetime risk for developing AMD is strongly associated with certain genes. For example, the lifetime risk of developing AMD is 50% for people that have a relative with AMD, versus 12% for people that do not have relatives with AMD.

Retinitis pigmentosa is an inherited, degenerative eye disease that causes severe vision impairment and often blindness. The disease process begins with changes in pigment and damage to the small arteries and blood vessels that supply blood to the retina. There is no cure for retinitis pigmentosa and no known treatment can stop the progressive vision loss caused by the disease.

A cataract is a clouding of the lens inside the eye which leads to a decrease in vision. Over time, a yellow-brown pigment is deposited within the lens and obstructs light from passing and being focused onto the retina at the back of the eye. Biological aging is the most common cause of a cataract, but a wide variety of other risk factors (e.g., excessive tanning, diabetes, prolonged steroid use) can cause a cataract.

Diabetic retinopathy is a systemic disease that affects up to 80% of all patients who have had diabetes for ten years or more. Diabetic retinopathy causes microvascular damage to a blood-retinal barrier in the eye and makes the retinal blood vessels more permeable to fluids.

People with low vision experience difficulties due to lack of visual acuity, field-of-view, color perception, and other visual impairments. These difficulties affect many aspects of everyday life. Persons with low vision may use magnifying glasses to compensate for some aspects of low vision. For example, if the smallest letter a person with 20/100 vision can read is five times larger than the smallest letter that a person with 20/20 vision can read, then 5× magnification should make everything that is resolvable to the person with 20/20 vision resolvable to the person with low vision. However, magnifying glasses are expensive and cannot remedy all aspects of low vision. For example, a person with low vision who wears magnifying glasses may still have a difficult time recognizing details from a distance (e.g., people, signboards, traffic lights, etc.). Accordingly, there is a need for other technologies that can assist people who have low vision in accomplishing everyday activities.

SUMMARY

Embodiments consistent with the present disclosure provide devices and methods for providing information to a user by processing images captured from the environment of the user. The disclosed embodiments may assist persons who have low vision.

Consistent with disclosed embodiments, an apparatus may be operated by at least one command retrieved from a captured image. In one aspect, the apparatus includes an image sensor configured to be worn by a user and to capture image data from an environment of the user, a mobile power source for powering at least the image sensor, and at least one portable processor device configured for tethering to the image sensor. The at least one portable processor device may be configured to access a database of a plurality of predefined system commands associated with printed information in the image data, and identify in the image data an existence of printed information associated with a specific system command stored in the database. The at least one portable processor device may be further configured to execute the specific system command after the printed information associated with the specific system command is identified.

Consistent with additional disclosed embodiments, an apparatus may be operated by at least one command retrieved from a captured image. In one aspect, the apparatus may include an image sensor configured to be worn by a user and to capture image data from an environment of the user, and at least one portable processor device configured for tethering to the image sensor. The at least one portable processor device may be configured to receive the image data from the image sensor. The image data may include printed information associated with a specific system command. The at least one portable processor device may be further configured to identify in the image data an existence of the printed information, identify the specific system command associated with the printed information, and execute the specific system command after the specific system command is identified.

Consistent with further disclosed embodiments, a method for executing at least one command retrieved from a captured image includes receiving image data from an image sensor. In one aspect, the image data includes printed information associated with a specific system command. The method further includes accessing a database including a plurality of predefined system commands associated with printed information, identifying in the image data an existence of the printed information associated with the specific system command stored in the database, and executing the specific system command after the printed information associated with the specific system command is identified.

Consistent with other disclosed embodiments, non-transitory computer-readable storage media may store program instructions, which are executed by at least one processor device and perform any of the methods described herein.

The foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate various disclosed embodiments. In the drawings:

FIG. 1 is a schematic illustration of a user wearing an apparatus for aiding persons who have low vision;

FIG. 2A is a schematic illustration of an example of a support from a first viewpoint;

FIG. 2B is a schematic illustration of the support shown in FIG. 2A from a second viewpoint;

FIG. 2C is a schematic illustration of the support shown in FIG. 2A mounted on a pair of glasses;

FIG. 2D is a schematic illustration of a sensory unit attached to the support that is mounted on the pair of glasses shown in FIG. 2C;

FIG. 2E is an exploded view of FIG. 2D;

FIG. 3A is a schematic illustration of an example of a sensory unit from a first viewpoint;

FIG. 3B is a schematic illustration of the sensory unit shown in FIG. 3A from a second viewpoint;

FIG. 3C is a schematic illustration of the sensory unit shown in FIG. 3A from a third viewpoint;

FIG. 3D is a schematic illustration of the sensory unit shown in FIG. 3A from a fourth viewpoint;

FIG. 3E is a schematic illustration of the sensory unit shown in FIG. 3A in an extended position;

FIG. 4A is a schematic illustration of an example of a processing unit from a first viewpoint;

FIG. 4B is a schematic illustration of the processing unit shown in FIG. 4A from a second viewpoint;

FIG. 5A is a block diagram illustrating an example of the components of an apparatus for aiding persons who have low vision according to a first embodiment;

FIG. 5B is a block diagram illustrating an example of the components of an apparatus for aiding persons who have low vision according to a second embodiment;

FIG. 5C is a block diagram illustrating an example of the components of an apparatus for aiding persons who have low vision according to a third embodiment;

FIG. 5D is a block diagram illustrating an example of the components of an apparatus for aiding persons who have low vision according to a fourth embodiment;

FIG. 6 illustrates an exemplary set of application modules and databases, according to disclosed embodiments;

FIG. 7 is a flow diagram of an exemplary process for identifying and executing system commands based on captured image data, according to disclosed embodiments;

FIG. 8 is a flow diagram of an exemplary process for identifying and executing system commands based on textual information within captured image data, according to disclosed embodiments;

FIG. 9 is a flow diagram of an exemplary process for executing an identified system command, according to disclosed embodiments; and

FIGS. 10-15 illustrate exemplary image data captured by an apparatus for aiding persons who have low vision, according to disclosed embodiments.

DETAILED DESCRIPTION

The following detailed description refers to the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the following description to refer to the same or similar parts. While several illustrative embodiments are described herein, modifications, adaptations and other implementations are possible. For example, substitutions, additions or modifications may be made to the components illustrated in the drawings, and the illustrative methods described herein may be modified by substituting, reordering, removing, or adding steps to the disclosed methods. Accordingly, the following detailed description is not limited to the disclosed embodiments and examples. Instead, the proper scope is defined by the appended claims.

Disclosed embodiments provide devices and methods for assisting people who have low vision. One example of the disclosed embodiments is a device that includes a camera configured to capture real-time image data from the environment of the user. The device also includes a processing unit configured to process the real-time image data and provide real-time feedback to the user. The real-time feedback may include, for example, an output that audibly identifies individuals from a distance, reads signboards, and/or identifies the state of a traffic light.

FIG. 1 illustrates a user 100 wearing an apparatus 110 connected to glasses 105, consistent with a disclosed embodiment. Apparatus 110 may provide functionality for aiding user 100 with various daily activities that are otherwise difficult for user 100 to accomplish due to low vision. Glasses 105 may be prescription glasses, magnifying glasses, nonprescription glasses, safety glasses, sunglasses, etc.

As shown in FIG. 1, apparatus 110 includes a sensory unit 120 and a processing unit 140. Sensory unit 120 may be connected to a support (not shown in FIG. 1) that is mounted on glasses 105. In addition, sensory unit 120 may include an image sensor (not shown in FIG. 1) for capturing real-time image data of the field-of-view of user 100. The term “image data” includes any form of data retrieved from optical signals in the near-infrared, infrared, visible, and ultraviolet spectrums. The image data may be used to form video clips and/or photographs.

Processing unit 140 may communicate wirelessly or via a wire 130 connected to sensory unit 120. In some embodiments, processing unit 140 may produce an output of audible feedback to user 100 (e.g., using a speaker or a bone conduction headphone).

Apparatus 110 is one example of a device capable of implementing the functionality of the disclosed embodiments. Other devices capable of implementing the disclosed embodiments include, for example, a mobile computer with a camera (e.g., a smartphone, a smartwatch, a tablet, etc.) or a clip-on camera configured to communicate with a processing unit (e.g., a smartphone or a dedicated processing unit, which can be carried in a pocket). A person skilled in the art will appreciate that different types of devices and arrangements of devices may implement the functionality of the disclosed embodiments.

FIG. 2A is a schematic illustration of an example of a support 210. As discussed in connection with FIG. 1, support 210 may be mounted on glasses 105 and connect to sensory unit 120. The term “support” includes any device or structure that enables detaching and reattaching of a device including a camera to a pair of glasses or to another object (e.g., a helmet). Support 210 may be made from plastic (e.g., polycarbonate), metal (e.g., aluminum), or a combination of plastic and metal (e.g., carbon fiber graphite). Support 210 may be mounted on glasses 105 using screws, bolts, snaps, or any fastening means used in the art.

As shown in FIG. 2A, support 210 includes a base 230 connected to a clamp 240. A bridge 220 connects base 230 with clamp 240. Base 230 and clamp 240 enable sensory unit 120 to easily attach to and detach from support 210. In one embodiment, base 230 may include an internally threaded member 250 for cooperating with a screw (not shown in FIG. 2A) to mount support 210 on glasses 105.

FIG. 2B illustrates support 210 from a second viewpoint. The viewpoint shown in FIG. 2B is from a side orientation of support 210.

FIG. 2C illustrates support 210 mounted on glasses 105. Support 210 may be configured for mounting on any kind of glasses (e.g., eyeglasses, sunglasses, 3D glasses, safety glasses, etc.). As shown in FIG. 2C, sensory unit 120 is not attached to support 210 and, accordingly, support 210 may be sold separately from apparatus 110. This arrangement makes apparatus 110 compatible with a variety of glasses. For example, some users may have several pairs of glasses and may wish to mount a support on each pair of glasses.

In other embodiments, support 210 may be an integral part of a pair of glasses, or sold and installed by an optometrist. For example, support 210 may be configured for mounting on the arms of glasses 105 near the frame front, but before the hinge. Alternatively, support 210 may be configured for mounting on the bridge of glasses 105.

FIG. 2D illustrates sensory unit 120 attached to support 210 (not visible in FIG. 2D), and support 210 mounted on glasses 105. In some embodiments, support 210 may include a quick release mechanism for disengaging and reengaging sensory unit 120. For example, support 210 and sensory unit 120 may include magnetic elements. As an alternative example, support 210 may include a male latch member and sensory unit 120 may include a female receptacle.

When sensory unit 120 is attached (or reattached) to support 210, the field-of-view of a camera associated with sensory unit 120 may be substantially identical to the field-of-view of user 100. Accordingly, in some embodiments, after support 210 is attached to sensory unit 120, directional calibration of sensory unit 120 may not be required because sensory unit 120 aligns with the field-of-view of user 100.

In other embodiments, support 210 may include an adjustment component (not shown in FIG. 2D) to enable calibration of the aiming direction of sensory unit 120 in a substantially set position that is customized to user 100 wearing glasses 105. For example, the adjustment component may include an adjustable hinge to enable vertical and horizontal alignment of the aiming direction of sensory unit 120. Adjusting the alignment of sensory unit 120 may assist users who have a unique and individual visual impairment. The adjustment may be internal or external to sensory unit 120.

FIG. 2E is an exploded view of the components shown in FIG. 2D. Sensory unit 120 may be attached to glasses 105 in the following way. Initially, support 210 may be mounted on glasses 105 using screw 260. Next, screw 260 may be inserted into internally threaded member 250 (not shown in FIG. 2E) in the side of support 210. Sensory unit 120 may then be clipped on support 210 such that it is aligned with the field-of-view of user 100.

FIG. 3A is a schematic illustration of sensory unit 120 from a first viewpoint. As shown in FIG. 3A, sensory unit 120 includes a feedback-outputting unit 340 and an image sensor 350.

Sensory unit 120 is configured to cooperate with support 210 using clip 330 and groove 320, which fits the dimensions of support 210. The term “sensory unit” refers to any electronic device configured to capture real-time images and provide a non-visual output. Furthermore, as discussed above, sensory unit 120 includes feedback-outputting unit 340. The term “feedback-outputting unit” includes any device configured to provide information to a user.

In some embodiments, feedback-outputting unit 340 may be configured to be used by blind persons and persons with low vision. Accordingly, feedback-outputting unit 340 may be configured to output nonvisual feedback. The term “feedback” refers to any output or information provided in response to processing at least one image in an environment. For example, feedback may include a descriptor of a branded product, an audible tone, a tactile response, and/or information previously recorded by user 100. Furthermore, feedback-outputting unit 340 may comprise appropriate components for outputting acoustical and tactile feedback that people with low vision can interpret. For example, feedback-outputting unit 340 may comprise audio headphones, a speaker, a bone conduction headphone, interfaces that provide tactile cues, vibrotactile stimulators, etc.

As discussed above, sensory unit 120 includes image sensor 350. The term “image sensor” refers to a device capable of detecting and converting optical signals in the near-infrared, infrared, visible, and ultraviolet spectrums into electrical signals. The electrical signals may be used to form an image based on the detected signal. For example, image sensor 350 may be part of a camera. In some embodiments, when sensory unit 120 is attached to support 210, image sensor 350 may acquire a set aiming direction without the need for directional calibration. The set aiming direction of image sensor 350 may substantially coincide with the field-of-view of user 100 wearing glasses 105. For example, a camera associated with image sensor 350 may be installed within sensory unit 120 at a predetermined angle in a position facing slightly downwards (e.g., 5-15 degrees from the horizon). Accordingly, the set aiming direction of image sensor 350 may match the field-of-view of user 100.

As shown in FIG. 3A, feedback-outputting unit 340 and image sensor 350 are included in a housing 310. The term “housing” refers to any structure that at least partially covers, protects, or encloses a sensory unit. The housing may be made from one or more different materials (e.g., plastic or aluminum). In one embodiment, housing 310 may be designed to engage with a specific pair of glasses having a specific support (e.g., support 210). In an alternative embodiment, housing 310 may be designed to engage more than one pair of glasses, each having a support (e.g., support 210) mounted thereon. Housing 310 may include a connector for receiving power from an external mobile-power-source or an internal mobile-power-source, and for providing an electrical connection to image sensor 350.

FIG. 3B is a schematic illustration of sensory unit 120 from a second viewpoint. As shown in FIG. 3B, housing 310 includes a U-shaped element. An inner distance “d” between each side of the U-shaped element is larger than the width of the arm of glasses 105. Additionally, the inner distance “d” between each side of the U-shaped element is substantially equal to a width of support 210. The inner distance “d” between each side of the U-shaped element may allow user 100 to easily attach housing 310 to support 210, which may be mounted on glasses 105. As illustrated in FIG. 3B, image sensor 350 is located on one side of the U-shaped element and feedback-outputting unit 340 is located on another side of the U-shaped element.

FIG. 3C is a schematic illustration of sensory unit 120 from a third viewpoint. The viewpoint shown in FIG. 3C is from a side orientation of sensory unit 120 and shows the side of the U-shaped element that includes image sensor 350.

FIG. 3D is a schematic illustration of sensory unit 120 from a fourth viewpoint. The viewpoint shown in FIG. 3D is from an opposite side of the orientation shown in FIG. 3C. FIG. 3D shows the side of the U-shaped element that includes feedback-outputting unit 340.

FIG. 3E is a schematic illustration of the sensory unit shown in FIG. 3A in an extended position. As shown in FIG. 3E, a portion of sensory unit 120 is extendable and wire 130 may pass through a channel of sensory unit 120. This arrangement may allow a user to adjust the length and the angle of sensory unit 120 without interfering with the operation of apparatus 110.

User 100 may adjust the U-shaped element of sensory unit 120 so that feedback-outputting unit 340 is positioned adjacent to the user's ear or the user's temple. Accordingly, sensory unit 120 may be adjusted for use with different users who may have different head sizes. Alternatively, a portion of sensory unit 120 may be flexible such that the angle of feedback-outputting unit 340 relative to the user's ear or temple can be adjusted.

FIG. 4A is a schematic illustration of processing unit 140. As shown in FIG. 4A, processing unit 140 has a rectangular shape, which easily fits in a pocket of user 100. Processing unit 140 includes a connector 400 for connecting wire 130 to processing unit 140. Wire 130 may be used to transmit power from processing unit 140 to sensory unit 120, and data between processing unit 140 and sensory unit 120. Alternatively, wire 130 may comprise multiple wires (e.g., a wire dedicated to power transmission and a wire dedicated to data transmission).

Processing unit 140 includes a function button 410 for enabling user 100 to provide input to apparatus 110. Function button 410 may accept different types of tactile input (e.g., a tap, a click, a double-click, a long press, a right-to-left slide, a left-to-right slide). In some embodiments, each type of input may be associated with a different action. For example, a tap may be associated with the function of confirming an action, while a right-to-left slide may be associated with the function of repeating the last output.

FIG. 4B is a schematic illustration of processing unit 140 from a second viewpoint. As shown in FIG. 4B, processing unit 140 includes a volume switch 420, a battery pack compartment 430, and a power port 440. In one embodiment, user 100 may charge apparatus 110 using a charger connectable to power port 440. Alternatively, user 100 may replace a battery pack (not shown) stored in battery pack compartment 430.

FIG. 5A is a block diagram illustrating the components of apparatus 110 according to a first embodiment. Specifically, FIG. 5A depicts an embodiment in which apparatus 110 comprises sensory unit 120 and processing unit 140, as discussed in connection with, for example, FIG. 1. Furthermore, sensory unit 120 may be physically coupled to support 210.

As shown in FIG. 5A, sensory unit 120 includes feedback-outputting unit 340 and image sensor 350. Although one image sensor is depicted in FIG. 5A, sensory unit 120 may include a plurality of image sensors (e.g., two image sensors). For example, in an arrangement with more than one image sensor, each of the image sensors may face a different direction or be associated with a different camera (e.g., a wide angle camera, a narrow angle camera, an IR camera, etc.). In other embodiments (not shown in the figure), sensory unit 120 may also include buttons and other sensors such as a microphone and inertial measurement devices.

As further shown in FIG. 5A, sensory unit 120 is connected to processing unit 140 via wire 130. Processing unit 140 includes a mobile power source 510, a memory 520, a wireless transceiver 530, and a processor 540.

Processor 540 may constitute any physical device having an electric circuit that performs a logic operation on input or inputs. For example, processor 540 may include one or more integrated circuits, microchips, microcontrollers, microprocessors, all or part of a central processing unit (CPU), graphics processing unit (GPU), digital signal processor (DSP), field-programmable gate array (FPGA), or other circuits suitable for executing instructions or performing logic operations. The instructions executed by processor 540 may, for example, be pre-loaded into a memory integrated with or embedded into processor 540 or may be stored in a separate memory (e.g., memory 520). Memory 520 may comprise a Random Access Memory (RAM), a Read-Only Memory (ROM), a hard disk, an optical disk, a magnetic medium, a flash memory, other permanent, fixed, or volatile memory, or any other mechanism capable of storing instructions.

Although one processor is shown in FIG. 5A, processing unit 140 may include more than one processor. Each processor may have a similar construction or the processors may be of differing constructions that are electrically connected or disconnected from each other. For example, the processors may be separate circuits or integrated in a single circuit. When more than one processor is used, the processors may be configured to operate independently or collaboratively. The processors may be coupled electrically, magnetically, optically, acoustically, mechanically or by other means that permit them to interact.

In some embodiments, processor 540 may change the aiming direction of image sensor 350 using image data provided from image sensor 350. For example, processor 540 may recognize that a user is reading a book and determine that the aiming direction of image sensor 350 is offset from the text. That is, because the words in the beginning of each line of text are not fully in view, processor 540 may determine that image sensor 350 is tilted down and to the right. Responsive thereto, processor 540 may adjust the aiming direction of image sensor 350.

Processor 540 may access memory 520. Memory 520 may be configured to store information specific to user 100. For example, data for image representations of known individuals, favorite products, personal items, etc., may be stored in memory 520. In one embodiment, user 100 may have more than one pair of glasses, with each pair of glasses having support 210 mounted thereon. Accordingly, memory 520 may store information (e.g., personal settings) associated with each pair of glasses. For example, a user may have different preferences when wearing sunglasses than when wearing reading glasses.

As shown in FIG. 5A, processing unit 140 includes mobile power source 510. Mobile power source 510 may be configured to power processing unit 140 and/or sensory unit 120. The term “mobile power source” includes any device capable of providing electrical power, which can be easily carried by a hand (e.g., the total weight of mobile power source 510 may be less than a pound). Thus, the mobility of the power source enables user 100 to use apparatus 110 in a variety of situations. For example, mobile power source 510 may include one or more batteries (e.g., nickel-cadmium batteries, nickel-metal hydride batteries, and lithium-ion batteries) or any other type of electrical power supply. In some embodiments, mobile power source 510 may be rechargeable and contained within a casing that holds processing unit 140. In other embodiments, mobile power source 510 may include one or more energy harvesting devices for converting ambient energy into electrical energy (e.g., portable solar power units, human vibration units, etc.).

Apparatus 110 may operate in a low-power-consumption mode and in a processing-power-consumption mode. For example, mobile power source 510 may provide five hours of operation in the processing-power-consumption mode and fifteen hours of operation in the low-power-consumption mode. Accordingly, different power consumption modes may allow mobile power source 510 to produce sufficient power for powering processing unit 140 for various time periods (e.g., more than two hours, more than four hours, more than ten hours, etc.).

Mobile power source 510 may power one or more wireless transceivers (e.g., wireless transceiver 530 in FIG. 5A). The term “wireless transceiver” refers to any device configured to exchange transmissions over an air interface by use of radio frequency, infrared frequency, magnetic field, or electric field. Wireless transceiver 530 may use any known standard to transmit and/or receive data (e.g., Wi-Fi, Bluetooth®, Bluetooth Smart, 802.15.4, or ZigBee). In some embodiments, wireless transceiver 530 may transmit data (e.g., raw image data or audio data) from image sensor 350 to processing unit 140, or wireless transceiver 530 may transmit data from processing unit 140 to feedback-outputting unit 340.

In another embodiment, wireless transceiver 530 may communicate with a different device (e.g., a hearing aid, the user's smartphone, or any wirelessly controlled device) in the environment of user 100. For example, wireless transceiver 530 may communicate with an elevator using a Bluetooth® controller. In such an arrangement, apparatus 110 may recognize that user 100 is approaching an elevator and call the elevator, thereby minimizing wait time. In another example, wireless transceiver 530 may communicate with a smart TV. In such an arrangement, apparatus 110 may recognize that user 100 is watching television and identify specific hand movements as commands for the smart TV (e.g., switching channels). In yet another example, wireless transceiver 530 may communicate with a virtual cane. A virtual cane is any device that uses a laser beam or ultrasound waves to determine the distance from user 100 to an object.

FIG. 5B is a block diagram illustrating the components of apparatus 110 according to a second embodiment. In FIG. 5B, similar to the arrangement shown in FIG. 5A, support 210 is used to couple sensory unit 120 to a pair of glasses. However, in the embodiment shown in FIG. 5B, sensory unit 120 and processing unit 140 communicate wirelessly. For example, wireless transceiver 530A can transmit image data to processing unit 140 and receive information to be outputted via feedback-outputting unit 340.

In this embodiment, sensory unit 120 includes feedback-outputting unit 340, mobile power source 510A, wireless transceiver 530A, and image sensor 350. Mobile power source 510A is contained within sensory unit 120. As further shown in FIG. 5B, processing unit 140 includes wireless transceiver 530B, processor 540, mobile power source 510B, and memory 520.

FIG. 5C is a block diagram illustrating the components of apparatus 110 according to a third embodiment. In particular, FIG. 5C depicts an embodiment in which support 210 includes image sensor 350 and connector 550B. In this embodiment, sensory unit 120 provides functionality for processing data and, therefore, a separate processing unit is not needed in such a configuration.

As shown in FIG. 5C, sensory unit 120 includes processor 540, connector 550A, mobile power source 510, memory 520, and wireless transceiver 530. In this embodiment, apparatus 110 does not include a feedback-outputting unit. Accordingly, wireless transceiver 530 may communicate directly with a hearing aid (e.g., a Bluetooth® hearing aid). In addition, in this embodiment, image sensor 350 is included in support 210. Accordingly, when support 210 is initially mounted on glasses 105, image sensor 350 may acquire a set aiming direction. For example, a camera associated with image sensor 350 may be installed within support 210 at a predetermined angle in a position facing slightly downwards (e.g., 7-12 degrees from the horizon). Furthermore, connector 550A and connector 550B may allow data and power to be transmitted between support 210 and sensory unit 120.

FIG. 5D is a block diagram illustrating the components of apparatus 110 according to a fourth embodiment. In FIG. 5D, sensory unit 120 couples directly to a pair of glasses without the need of a support. In this embodiment, sensory unit 120 includes image sensor 350, feedback-outputting unit 340, processor 540, and memory 520. As shown in FIG. 5D, sensory unit 120 is connected via a wire 130 to processing unit 140. Additionally, in this embodiment, processing unit 140 includes mobile power source 510 and wireless transceiver 530.

As will be appreciated by a person skilled in the art having the benefit of this disclosure, numerous variations and/or modifications may be made to the disclosed embodiments. Not all components are essential for the operation of apparatus 110. Any component may be located in any appropriate part of apparatus 110 and the components may be rearranged into a variety of configurations while providing the functionality of the disclosed embodiments. Therefore, the foregoing configurations are examples and, regardless of the configurations discussed above, apparatus 110 can assist persons who have low vision with their everyday activities in numerous ways.

One way apparatus 110 can assist persons who have low vision is by identifying relevant objects in an environment. For example, in some embodiments, processor 540 may execute one or more computer algorithms and/or signal-processing techniques to find objects relevant to user 100 in image data captured by sensory unit 120. The term “object” refers to any physical object, person, text, or surroundings in an environment.

In one embodiment, apparatus 110 can perform a hierarchical object identification process. In a hierarchical object identification process, apparatus 110 can identify objects from different categories (e.g., spatial guidance, warning of risks, objects to be identified, text to be read, scene identification, and text in the wild) of image data. For example, apparatus 110 can perform a first search in the image data to identify objects from a first category, and after initiating the first search, execute a second search in the image data to identify objects from a second category.
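By way of illustration, the following is a minimal sketch of how such a hierarchical search might be organized in software; the category names, their ordering, and the detect_objects() helper are assumptions made for the example and are not part of the disclosure.

```python
# Hypothetical sketch of a hierarchical object identification process.
# Categories are searched in priority order; detect_objects() stands in
# for whatever per-category detector the device might use.

CATEGORY_PRIORITY = [
    "warning_of_risks",     # e.g., obstacles, moving vehicles
    "spatial_guidance",     # e.g., doorways, crosswalks
    "objects_to_identify",  # e.g., branded products, known individuals
    "text_to_read",         # e.g., signboards, documents
]


def detect_objects(image_data, category):
    """Placeholder detector; a real system would run a model per category."""
    return []  # list of (label, bounding_box) tuples


def hierarchical_identification(image_data):
    results = {}
    for category in CATEGORY_PRIORITY:
        found = detect_objects(image_data, category)
        results[category] = found
        # A later search may be narrowed or skipped based on earlier results,
        # e.g., stop as soon as an urgent warning has been found.
        if category == "warning_of_risks" and found:
            break
    return results
```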

In another embodiment, apparatus 110 can provide information associated with one or more of the objects identified in image data. For example, apparatus 110 can provide information such as the name of an individual standing in front of user 100. The information may be retrieved from a dynamic database stored in memory 520. If the database does not contain specific information associated with the object, apparatus 110 may provide user 100 with nonvisual feedback indicating that a search was made, but the requested information was not found in the database. Alternatively, apparatus 110 may use wireless transceiver 530 to search for and retrieve information associated with the object from a remote database (e.g., over a cellular network or Wi-Fi connection to the Internet).
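A hedged sketch of this lookup flow appears below: the local database is consulted first, and a remote database is queried only when the object is not found locally. The table name, file path, and URL are hypothetical placeholders, not part of the disclosure.

```python
# Illustrative local-then-remote lookup. The database schema and the
# remote endpoint are assumptions for the sketch.

import json
import sqlite3
import urllib.request


def lookup_object(object_id, db_path="objects.db",
                  remote_url="https://example.com/objects/"):
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT description FROM objects WHERE id = ?", (object_id,)
        ).fetchone()
    finally:
        conn.close()
    if row:
        return row[0]  # found in the dynamic database stored in memory 520
    try:
        # Fall back to a remote database over the wireless transceiver.
        with urllib.request.urlopen(remote_url + object_id, timeout=5) as resp:
            return json.load(resp).get("description")
    except OSError:
        return None  # caller reports that the information was not found
```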

Another way apparatus 110 can assist persons who have low vision is by performing a continuous action that relates to an object in an environment. A continuous action may involve providing continuous feedback regarding the object. For example, apparatus 110 can provide continuous feedback associated with an object identified within a field-of-view of image sensor 350, and suspend the continuous feedback when the object moves outside the field-of-view of image sensor 350. Examples of continuous feedback may include audibly reading text, playing a media file, etc. In addition, in some embodiments, apparatus 110 may provide continuous feedback to user 100 based on information derived from a discrete image or based on information derived from one or more images captured by sensory unit 120 from the environment of user 100.

Another type of continuous action includes monitoring the state of an object in an environment. For example, in one embodiment, apparatus 110 can track an object as long as the object remains substantially within the field-of-view of image sensor 350. Furthermore, before providing user 100 with feedback, apparatus 110 may determine whether the object is likely to change its state. If apparatus 110 determines that the object is unlikely to change its state, apparatus 110 may provide a first feedback to user 100. For example, if user 100 points to a road sign, apparatus 110 may provide a first feedback that comprises a descriptor of the road sign. However, if apparatus 110 determines that the object is likely to change its state, apparatus 110 may provide a second feedback to user 100 after the object has changed its state. For example, if user 100 points at a traffic light, the first feedback may comprise a descriptor of the current state of the traffic light (e.g., the traffic light is red) and the second feedback may comprise a descriptor indicating that the state of the traffic light has changed (i.e., the traffic light is now green).

Apparatus 110 may also determine that an object that is expected to change its state is not functioning and provide appropriate feedback. For example, apparatus 110 may provide a descriptor indicating that a traffic light is broken.
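One way to sketch this state-monitoring behavior in code is shown below; the classify_state(), capture, and speak hooks are placeholders, and the timeout used to decide that the object may not be functioning is an assumption for the example.

```python
# Hypothetical sketch of monitoring an object that is likely to change state.

import time


def classify_state(image):
    """Placeholder classifier, e.g., returns 'red' or 'green' for a traffic light."""
    return "red"


def monitor_traffic_light(capture_image, speak, poll_interval=0.5, max_wait=120.0):
    first_state = classify_state(capture_image())
    speak(f"The traffic light is {first_state}.")            # first feedback
    started = time.monotonic()
    while time.monotonic() - started < max_wait:
        current = classify_state(capture_image())
        if current != first_state:
            speak(f"The traffic light is now {current}.")    # second feedback
            return
        time.sleep(poll_interval)
    speak("The traffic light may not be functioning.")       # object appears broken
```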

Apparatus 110 can also assist persons who have low vision by making intelligent decisions regarding a person's intentions. Apparatus 110 can make these decisions by understanding the context of a situation. Accordingly, disclosed embodiments may retrieve contextual information from captured image data and adjust the operation of apparatus 110 based on at least the contextual information. The term “contextual information” (or “context”) refers to any information having a direct or indirect relationship with an object in an environment. In some embodiments, apparatus 110 may retrieve different types of contextual information from captured image data. One type of contextual information is the time and/or the place that an image of the object was captured. Another example of a type of contextual information is the meaning of text written on the object. Other examples of types of contextual information include the identity of an object, the type of the object, the background of the object, the location of the object in the frame, the physical location of the user relative to the object, etc.

In an embodiment, the type of contextual information that is used to adjust the operation of apparatus 110 may vary based on objects identified in the image data and/or the particular user who wears apparatus 110. For example, when apparatus 110 identifies a package of cookies as an object, apparatus 110 may use the location of the package (i.e., at home or at the grocery store) to determine whether or not to read the list of ingredients aloud. Alternatively, when apparatus 110 identifies a signboard identifying arrival times for trains as an object, the location of the sign may not be relevant, but the time that the image was captured may affect the output. For example, if a train is arriving soon, apparatus 110 may read aloud the information regarding the coming train. Accordingly, apparatus 110 may provide different responses depending on contextual information.

Apparatus 110 may use contextual information to determine a processing action to execute or an image resolution of image sensor 350. For example, after identifying the existence of an object, contextual information may be used to determine if the identity of the object should be announced, if text written on the object should be audibly read, if the state of the object should be monitored, or if an image representation of the object should be saved. In some embodiments, apparatus 110 may monitor a plurality of images and obtain contextual information from specific portions of an environment. For example, motionless portions of an environment may provide background information that can be used to identify moving objects in the foreground.
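The decision logic described in the preceding paragraphs might be sketched roughly as follows; the object types, context fields, thresholds, and resolution labels are assumptions for illustration only.

```python
# Hypothetical mapping from an identified object plus contextual information
# to a processing action and a capture resolution.

def choose_action(object_type, context):
    if object_type == "package" and context.get("location") == "grocery_store":
        return ("read_ingredients_aloud", "high_resolution")
    if object_type == "train_signboard" and context.get("minutes_to_arrival", 99) < 10:
        return ("read_arrival_information_aloud", "high_resolution")
    if object_type == "traffic_light":
        return ("monitor_state", "low_resolution")
    return ("announce_identity", "low_resolution")


# Example: choose_action("package", {"location": "grocery_store"})
# returns ("read_ingredients_aloud", "high_resolution").
```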

Yet another way apparatus 110 can assist persons who have low vision is by automatically carrying out processing actions after identifying specific objects and/or hand gestures in the field-of-view of image sensor 350. For example, processor 540 may execute several actions after identifying one or more triggers in image data captured by apparatus 110. The term “trigger” includes any information in the image data that may cause apparatus 110 to execute an action. For example, apparatus 110 may detect as a trigger a finger of user 100 pointing to one or more coins. The detection of this gesture may cause apparatus 110 to calculate a sum of the value of the one or more coins. As another example of a trigger, an appearance of an individual wearing a specific uniform (e.g., a policeman, a fireman, a nurse) in the field-of-view of image sensor 350 may cause apparatus 110 to make an audible indication that this particular individual is nearby.

In some embodiments, the trigger identified in the image data may constitute a hand-related trigger. The term “hand-related trigger” refers to a gesture made by, for example, the user's hand, the user's finger, or any pointed object that user 100 can hold (e.g., a cane, a wand, a stick, a rod, etc.).

In other embodiments, the trigger identified in the image data may include an erratic movement of an object caused by user 100. For example, unusual movement of an object can trigger apparatus 110 to take a picture of the object. In addition, each type of trigger may be associated with a different action. For example, when user 100 points to text, apparatus 110 may audibly read the text. As another example, when user 100 erratically moves an object, apparatus 110 may audibly identify the object or store the representation of that object for later identification.

Apparatus 110 may use the same trigger to execute several actions. For example, when user 100 points to text, apparatus 110 may audibly read the text. As another example, when user 100 points to a traffic light, apparatus 110 may monitor the state of the traffic light. As yet another example, when user 100 points to a branded product, apparatus 110 may audibly identify the branded product. Furthermore, in embodiments in which the same trigger is used for executing several actions, apparatus 110 may determine which action to execute based on contextual information retrieved from the image data. In the examples above, wherein the same trigger (pointing to an object) is used, apparatus 110 may use the type of the object (text, a traffic light, a branded product) to determine which action to execute.
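As a sketch, the dispatch just described (one trigger, several possible actions chosen by object type) could look like the following; the handler names are hypothetical.

```python
# Illustrative dispatch for a single pointing trigger whose effect depends
# on the type of object pointed at.

def read_text_aloud(obj): ...
def monitor_traffic_light_state(obj): ...
def announce_branded_product(obj): ...
def announce_identity(obj): ...


POINTING_ACTIONS = {
    "text": read_text_aloud,
    "traffic_light": monitor_traffic_light_state,
    "branded_product": announce_branded_product,
}


def handle_pointing_trigger(object_type, obj):
    handler = POINTING_ACTIONS.get(object_type, announce_identity)
    handler(obj)
```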

To assist user 100 throughout his or her daily activities, apparatus 110 may follow several procedures for saving processing resources and prolonging battery life. For example, apparatus 110 can use several image resolutions to form images. Higher image resolution provides more detailed images, but requires more processing resources. Lower image resolution provides less detailed images, but saves processing resources. Therefore, to prolong battery life, apparatus 110 may have rules for capturing and processing high resolution images under certain circumstances, and rules for capturing and processing low resolution images when possible. For example, apparatus 110 may capture higher resolution images when performing Optical Character Recognition (OCR), and capture low resolution images when searching for a trigger.

One of the common challenges persons with low vision face on a daily basis is reading. Apparatus 110 can assist persons who have low vision by audibly reading text that is present in the environment of user 100. Apparatus 110 may capture an image that includes text using sensory unit 120. After capturing the image, to save resources and to process portions of the text that are relevant to user 100, apparatus 110 may initially perform a layout analysis on the text. The term “layout analysis” refers to any process of identifying regions in an image that include text. For example, layout analysis may detect paragraphs, blocks, zones, logos, titles, captions, footnotes, etc.

In one embodiment, apparatus 110 can select which parts of the image to process, thereby saving processing resources and battery life. For example, apparatus 110 can perform a layout analysis on image data taken at a resolution of one megapixel to identify specific areas of interest within the text. Subsequently, apparatus 110 can instruct image sensor 350 to capture image data at a resolution of five megapixels to recognize the text in the identified areas. In other embodiments, the layout analysis may include initiating at least a partial OCR process on the text.
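A rough sketch of this two-pass strategy is shown below; the sensor interface, the crop and OCR helpers, and the megapixel values simply echo the example above and are placeholders.

```python
# Hypothetical two-pass capture: layout analysis on a low-resolution frame,
# then a high-resolution recapture of only the regions of interest for OCR.

def find_text_regions(low_res_image):
    """Placeholder layout analysis; returns bounding boxes of text areas."""
    return []


def crop(image, box):
    """Placeholder crop of a bounding box from an image."""
    return image


def run_ocr(image_patch):
    """Placeholder OCR over a single cropped region."""
    return ""


def two_pass_read(sensor):
    low_res = sensor.capture(megapixels=1)       # cheap layout pass
    regions = find_text_regions(low_res)
    recognized = []
    if regions:
        high_res = sensor.capture(megapixels=5)  # detailed pass for OCR
        for box in regions:
            recognized.append(run_ocr(crop(high_res, box)))
    return " ".join(recognized)
```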

In another embodiment, apparatus 110 may detect a trigger that identifies a portion of text that is located a distance from a level break in the text. A level break in the text represents any discontinuity of the text (e.g., a beginning of a sentence, a beginning of a paragraph, a beginning of a page, etc.). Detecting this trigger may cause apparatus 110 to read the text aloud from the level break associated with the trigger. For example, user 100 can point to a specific paragraph in a newspaper and apparatus 110 may audibly read the text from the beginning of the paragraph instead of from the beginning of the page.
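A minimal sketch of starting the audible reading at the nearest preceding level break, here taken to be a paragraph boundary marked by a blank line, is given below; the offset handling is an assumption for the example.

```python
# Hypothetical helper: given the character offset the user pointed at,
# return the text starting at the nearest preceding level break
# (here, a paragraph boundary) rather than the top of the page.

def text_from_level_break(full_text, pointed_offset):
    breaks = [0] + [i + 2 for i in range(len(full_text) - 1)
                    if full_text[i:i + 2] == "\n\n"]
    start = max(b for b in breaks if b <= pointed_offset)
    return full_text[start:]
```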

In addition, apparatus 110 may identify contextual information associated with text and cause the audible presentation of one portion of the text and exclude other portions of the text. For example, when pointing to a food product, apparatus 110 may audibly identify the calorie value of the food product. In other embodiments, contextual information may enable apparatus 110 to construct a specific feedback based on at least data stored in memory 520. For example, the specific feedback may assist user 100 in filling out a form (e.g., by providing user 100 audible instructions and details relevant to a form in the user's field-of-view).

To improve the audible reading capabilities of apparatus 110, processor 540 may use OCR techniques. The term “optical character recognition” includes any method executable by a processor to retrieve machine-editable text from images of text, pictures, graphics, etc. OCR techniques and other document recognition technology typically use a pattern matching process to compare the parts of an image to sample characters on a pixel-by-pixel basis. This process, however, does not work well when encountering new fonts, and when the image is not sharp. Accordingly, apparatus 110 may use an OCR technique that compares a plurality of sets of image regions that are proximate to each other. Apparatus 110 may recognize characters in the image based on statistics related to the plurality of the sets of image regions. By using the statistics of the plurality of sets of image regions, apparatus 110 can recognize small font characters defined by more than four pixels (e.g., six or more pixels). In addition, apparatus 110 may use several images from different perspectives to recognize text on a curved surface. In another embodiment, apparatus 110 can identify in image data an existence of printed information associated with a system command stored in a database and execute the system command thereafter. Examples of a system command include: “enter training mode,” “enter airplane mode,” “backup content,” “update operating system,” etc.
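A minimal sketch of matching recognized text against a table of predefined system commands is shown below; the command phrases mirror the examples in the paragraph above, while the internal command identifiers are hypothetical.

```python
# Illustrative matching of OCR output against predefined system commands.

SYSTEM_COMMANDS = {
    "enter training mode": "TRAINING_MODE",
    "enter airplane mode": "AIRPLANE_MODE",
    "backup content": "BACKUP_CONTENT",
    "update operating system": "UPDATE_OS",
}


def find_system_command(recognized_text):
    lowered = recognized_text.lower()
    for phrase, command in SYSTEM_COMMANDS.items():
        if phrase in lowered:
            return command
    return None  # no printed command found in the image data
```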

The disclosed OCR techniques may be implemented on various devices and systems and are not limited to use with apparatus 110. For example, the disclosed OCR techniques provide accelerated machine reading of text. In one embodiment, a system is provided for audibly presenting a first part of a text from an image, while recognizing a subsequent part of the text. Accordingly, the subsequent part may be presented immediately upon completion of the presentation of the first part, resulting in a continuous audible presentation of standard text in less than two seconds after initiating OCR.
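The accelerated-reading idea can be sketched as a simple producer/consumer pipeline in which recognition of later sections overlaps with audible presentation of earlier ones; the section-level OCR and speech hooks below are placeholders assumed for the example.

```python
# Hypothetical pipeline: speak the first recognized section while OCR of the
# remaining sections continues in the background.

import queue
import threading


def ocr_section(image, section_index):
    """Placeholder: recognize one section (e.g., one paragraph) of the image."""
    return f"<recognized text of section {section_index}>"


def speak(text):
    """Placeholder: send text to the feedback-outputting unit."""
    print(text)


def read_document(image, num_sections):
    recognized = queue.Queue()

    def recognize_all():
        for i in range(num_sections):
            recognized.put(ocr_section(image, i))
        recognized.put(None)  # sentinel: recognition finished

    threading.Thread(target=recognize_all, daemon=True).start()
    while True:
        section_text = recognized.get()  # present as soon as it is available
        if section_text is None:
            break
        speak(section_text)
```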

As is evident from the foregoing, apparatus 110 may provide a wide range of functionality. More specifically, in embodiments consistent with the present disclosure, apparatus 110 may capture image data that includes textual and non-textual information disposed within a field-of-view of sensory unit 120, identify one or more system commands associated with the textual information and non-textual information, and subsequently execute the one or more system commands automatically or in response to an input received from a user of apparatus 110.

In certain aspects, “textual information” consistent with the disclosed embodiments may include, but is not limited to, printed text, handwritten text, coded text, text projected onto a corresponding surface, text displayed to the user through a corresponding display screen or touchscreen, and any additional or alternate textual information appropriate to the user and to apparatus 110. Further, the “non-textual information” may include, but is not limited to, images of various triggers (e.g., a human appendage, a cane, or a pointer), images of physical objects, images of persons, images of surroundings, and images of other non-textual objects disposed within the field-of-view of sensory unit 120.

In certain aspects, apparatus 110 may perform an OCR process on the textual information within the captured image data, and may subsequently identify the one or more system commands based on portions of the recognized text. In other aspects, apparatus 110 may detect elements of non-textual information within the captured image data, and may initiate the identification of the one or more system commands in response to the detected non-textual information.

In an embodiment, apparatus 110 may include a memory (e.g., memory 520) configured to store one or more applications and application modules that, when executed by a processor (e.g., processor 540), enable apparatus 110 to identify and execute system commands based on textual and non-textual information within captured image data. In certain aspects, memory 520 may also be configured to store information that identifies the system commands and associates the system commands with elements of textual information (e.g., characters, words, and phrases), elements of non-textual information (e.g., images of triggers, physical objects, and persons), other system commands, and other events. FIG. 6 illustrates an exemplary structure of memory 520, in accordance with disclosed embodiments.

In FIG. 6, memory 520 may be configured to store an image data storage module 602, an image processing module 604, and an image database 612. In one embodiment, image data storage module 602, upon execution by processor 540, may enable processor 540 to receive data corresponding to one or more images captured by sensory unit 120, and to store the captured image data within image database 612. In some aspects, the captured image data may include textual information (e.g., printed, handwritten, coded, projected, and/or displayed text) and non-textual information (e.g., images of physical objects, persons, and/or triggers), and processor 540 may store the image data in image database 612 with additional data specifying a time and/or date at which sensory unit 120 captured the image data. In additional embodiments, image data storage module 602 may further enable processor 540 to configure wireless transceiver 530 to transmit the captured image data to one or more devices (e.g., an external data repository or a user's mobile device) in communication with apparatus 110 across a corresponding wired or wireless network.

In an embodiment, image processing module 604, upon execution by processor 540, may enable processor 540 to process the captured image data and identify elements of textual information within the captured image data. In certain aspects, textual information consistent with the disclosed embodiments may include, but is not limited to, printed text (e.g., text disposed on a page of a newspaper, magazine, or book), handwritten text, coded text, text displayed to a user through a display unit of a corresponding device (e.g., an electronic book, a television, a web page, or a screen of a mobile application), text disposed on a flat or curved surface of an object within a field-of-view of apparatus 110 (e.g., a billboard sign, a street sign, text displayed on product packaging), text projected onto a corresponding screen (e.g., during presentation of a movie at a theater), and any additional or alternate text disposed within images captured by sensory unit 120.

In certain aspects, processor 540 may perform a layout analysis of the image data to identify textual information within the captured image data. By way of example, processor 540 may perform a layout analysis to detect paragraphs of text, blocks of text, zones and/or regions that include text, logos, titles, captions, footnotes, and any additional or alternate portions of the image data that include printed, handwritten, displayed, coded, and/or projected text.

Referring back to FIG. 6, memory 520 may also include an optical character recognition (OCR) module 606 that, upon execution by processor 540, enables processor 540 to perform one or more OCR processes on elements of textual information disposed within the image data. In one embodiment, processor 540 may execute image processing module 604 to identify portions of the captured image data that include textual information, and further, may execute OCR module 606 to retrieve machine-readable text from the textual information.

Memory 520 may also be configured to store a system command identification module 608, a system command execution module 610, and a system command database 614. In one embodiment, system command database 614 may store linking information that associates one or more system commands with corresponding portions of captured image data. In some aspects, a system command may include one or more instructions that, when executed by processor 540, cause processor 540 to perform one or more actions or processes consistent with an operating system of apparatus 110. Further, in one aspect, linking information may associate a particular system command with an element of recognized text (e.g., a word, a phrase, or a paragraph), an element of non-textual information (e.g., an image of a physical object, a person, or a trigger), combinations thereof, and any additional or alternate indicia of linkages between captured image data and system commands.
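One possible shape for such linking information is sketched below; the field names, cue strings, and command identifiers are assumptions made for the example, not entries from the disclosure.

```python
# Illustrative linking information tying system commands to textual and
# non-textual cues detected in captured image data.

LINKING_INFO = [
    {"command": "AIRPLANE_MODE",
     "text_cues": ["please turn off electronic devices", "airplane mode"],
     "image_cues": ["airplane_icon"]},
    {"command": "SLEEP_MODE",
     "text_cues": ["quiet please"],
     "image_cues": []},
]


def identify_commands(recognized_text, detected_image_labels):
    text = recognized_text.lower()
    matches = []
    for entry in LINKING_INFO:
        text_hit = any(cue in text for cue in entry["text_cues"])
        image_hit = any(label in detected_image_labels for label in entry["image_cues"])
        if text_hit or image_hit:
            matches.append(entry["command"])
    return matches
```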

In an embodiment, system command identification module 608 may, upon execution by processor 540, enable processor 540 to access linking information stored within system command database 614, and to identify one or more system commands associated with portions of the captured image data based on the linking information. For example, processor 540 may leverage the accessed linking information to determine that a portion of machine-readable text corresponds to a system command executable by processor 540. Additionally or alternatively, system command identification module 608 may enable processor 540 to identify a system command associated with a particular image within the captured image data, and further, a system command associated with a particular trigger in the captured image data, taken alone or in conjunction with machine-readable text.

System command execution module 610 may, upon execution by processor 540, enable processor 540 to execute the identified system command and perform one or more actions and processes consistent with the operating system of apparatus 110. In one instance, the identified system command may enable processor 540 to modify an operational state of apparatus 110. For example, upon execution of the identified system command by processor 540, apparatus 110 may function in accordance with a “training” mode, a “sleep” mode, an “airplane” mode, or another mode of operation consistent with the captured image data and the operating system of apparatus 110.

In other aspects, the identified system command may enable processor 540 to modify a configuration of apparatus 110. By way of example, upon execution of the identified system command, processor 540 may modify a configuration of one or more of sensory unit 120 and processing unit 140. Further, in additional embodiments, processor 540 may execute the identified system command to modify a configuration of an external device in communication with apparatus 110 across a corresponding wired or wireless communications network (e.g., a mobile telephone, a smart phone, or a tablet computer).

Additionally or alternatively, the identified system command may enable processor 540 to execute one or more applications and/or perform one or more actions supported by an operating system of apparatus 110. By way of example, processor 540 may initiate or terminate a recording of audio or video content, download a stored digital image (e.g., to image database 612), transmit a stored digital image to an external device in communication with apparatus 110, update or restart the operating system of apparatus 110, and establish, modify, or erase a user customization of apparatus 110.

Further, in an embodiment, the identified system command may include a plurality of steps associated with corresponding system sub-commands, and processor 540 may be configured to execute sequentially the corresponding system sub-commands upon execution of system command execution module 610. For example, the identified system command may update the operating system of apparatus 110, and the corresponding sub-commands may enable processor 540 to obtain an updated version of the operating system, replace the existing version of the operating system with the updated version, and restart apparatus 110 upon completion of the replacement.
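
A minimal sketch of such a compound command appears below, using the operating-system update as the example. The three sub-command functions are hypothetical placeholders that merely log the step they represent.

    def download_updated_version():
        print("obtaining updated version of the operating system")


    def replace_existing_version():
        print("replacing existing version with the updated version")


    def restart_apparatus():
        print("restarting apparatus upon completion of the replacement")


    def update_operating_system():
        # The corresponding sub-commands are executed sequentially, in order.
        for sub_command in (download_updated_version,
                            replace_existing_version,
                            restart_apparatus):
            sub_command()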

In further embodiments, the identified system command may cause processor 540 to perform an action on one or more files stored locally by apparatus 110, and additionally or alternatively, on one or more files stored within a data repository or external device accessible to apparatus 110 over a corresponding communications network. By way of example, upon execution of the identified system command, processor 540 may store image, video, and/or audio files in memory 520, overwrite one or more files stored within memory 520, and additionally or alternatively, transmit one or more files stored within memory 520 to the external device.

In other embodiments, image database 612 and/or system command database 614 may be located remotely from memory 520, and may be accessible to other components of apparatus 110 (e.g., processing unit 140) via one or more wireless connections (e.g., a wireless network). While two databases are shown, it should be understood that image database 612 and system command database 614 may be combined and/or may comprise interconnected databases. Image database 612 and/or system command database 614 may further include computing components (e.g., a database management system, a database server, etc.) configured to receive and process requests for data stored in associated memory devices.

Image data store module 602, image processing module 604, OCR module 606, system command identification module 608, and system command execution module 610 may be implemented in software, hardware, firmware, a mix of any of those, or the like. For example, if the modules are implemented in software, they may be stored in memory 520, as shown in FIG. 6. Other components of processing unit 140 and/or sensory unit 120 may be configured to perform processes to implement and facilitate operations of image data store module 602, image processing module 604, OCR module 606, system command identification module 608, and system command execution module 610. Thus, image data store module 602, image processing module 604, OCR module 606, system command identification module 608, and system command execution module 610 may include software, hardware, or firmware instructions (or a combination thereof) executable by one or more processors (e.g., processor 540), alone or in various combinations with each other. For example, image data store module 602, image processing module 604, OCR module 606, system command identification module 608, and system command execution module 610 may be configured to interact with each other and/or other modules of apparatus 110 to perform functions consistent with disclosed embodiments. In some embodiments, any of the disclosed modules (e.g., image data store module 602, image processing module 604, OCR module 606, system command identification module 608, and system command execution module 610) may each include dedicated sensors (e.g., IR, image sensors, etc.) and/or dedicated application processing devices to perform the functionality associated with each module.

FIG. 7 is a flow diagram of an exemplary process 700 for identifying and executing system commands based on captured image data, according to disclosed embodiments. As described above, sensory unit 120 may capture image data that includes textual information and non-textual information disposed within a corresponding field-of-view. Processing unit 140 may receive the captured image data, and processor 540 may execute one or more application modules to identify the textual and non-textual information, and to execute one or more system commands that correspond to the identified textual and non-textual information. Process 700 provides further details on how processor 540 identifies and executes one or more system commands based on captured image data.

In step 702, processor 540 may obtain captured image data. In some aspects, sensory unit 120 may capture one or more images, and the captured image data may be transmitted to processing unit 140 across wired or wireless communications link 130. Processor 540 may, in step 702, obtain the captured image data directly from sensory unit 120 across communications link 130, or alternatively, processor 540 may retrieve the captured image data from a corresponding data repository (e.g., image database 612 of memory 520). By way of example, the captured image data may include one or more regions of printed, displayed, or projected information.

In step 704, processor 540 may analyze the captured image data to identify portions of the captured image data that include textual information. As described above, the textual information may include, but is not limited to, printed, handwritten, projected, coded, or displayed text, and processor 540 may perform a layout analysis to detect the textual information within the captured image data. By way of example, the detected portions may include, but are not limited to, paragraphs of text, blocks of text, zones and/or regions that include text, logos, titles, captions, footnotes, and any additional or alternate portions of the captured image data that include printed, handwritten, displayed, coded, and/or projected text.

Additionally or alternatively, processor 540 may analyze the captured image data using image processing techniques in step 704 to identify non-textual information within the captured image data. In certain aspects, the non-textual information may include, but is not limited to, an image of a trigger (e.g., a human appendage or a cane), an image of a person (e.g., a police officer, a firefighter, or an airline employee), an image of a physical object (e.g., a streetlight and/or a pedestrian crossing signal, a particular vehicle, a map), and any additional or alternate image relevant to the user of apparatus 110.

In step 706, processor 540 may identify one or more system commands associated with the textual and non-textual information. In one embodiment, to identify the one or more system commands, processor 540 may obtain linking information in step 706 that associates the system commands with corresponding portions of textual information, non-textual information, or combinations of textual and non-textual information. In certain aspects, processor 540 may access system command database 614 to obtain the linking information. Alternatively, processor 540 may obtain the linking information from a data repository in communication with apparatus 110 across a corresponding communications network using appropriate communications protocols.

For example, processor 540 may determine in step 706 that a system command is associated with textual information when the linking information for that system command includes at least a portion of the textual information. Additionally or alternatively, processor 540 may also determine in step 706 that the system command is associated with non-textual information when the linking information for that system command includes information identifying the non-textual information, either alone or in combination with textual information.

Processor 540 may execute the one or more identified system commands in step 708. Upon execution of the one or more identified system commands by processor 540, exemplary process 700 ends.
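
The following simplified sketch summarizes steps 702 through 708. It assumes the text-detection and command-identification sketches given earlier and hypothetical callables for trigger detection and command execution; it is illustrative only and does not limit process 700.

    def process_700(image_path, detect_text, detect_triggers,
                    identify_commands, execute_command):
        """Simplified sketch of process 700 (FIG. 7)."""
        blocks = detect_text(image_path)             # step 704: textual information
        triggers = detect_triggers(image_path)       # step 704: non-textual information
        recognized_text = " ".join(blocks.values())
        commands = identify_commands(recognized_text, triggers)   # step 706
        for command in commands:                     # step 708
            execute_command(command)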

As described above, one or more of the executed system commands may correspond to an operation that modifies a functional state of apparatus 110 or an external device in communication with apparatus 110 (e.g., that causes apparatus 110 to enter a sleep mode, a training mode, or an airplane mode). The executed system commands may also correspond to processes performed by and supported by an operating system of apparatus 110, which include, but are not limited to, processes that initiate or terminate a recording of audio or video content, download a stored digital image (e.g., to image database 612), perform an action on one or more files associated with a user of apparatus 110, transmit a stored digital image to an external device in communication with apparatus 110, update or restart the operating system of apparatus 110, back up stored content, obtain information indicative of a status of a battery of apparatus 110, obtain audible instructions regarding one or more functions of apparatus 110, and establish, modify, or erase a user customization of apparatus 110 (e.g., a volume associated with apparatus 110 or a gender of an audible narration provided by apparatus 110). Further, and consistent with the disclosed embodiments, at least one of the system commands may be associated with a plurality of steps, which correspond to system sub-commands executed sequentially by processor 540. The disclosed embodiments are, however, not limited to such exemplary system commands, and in additional embodiments, processor 540 may identify (e.g., in step 706) and execute (e.g., in step 708) any additional or alternate system command appropriate to processor 540, apparatus 110, and the captured image data.

FIG. 8 is a flow diagram of an exemplary process 800 for identifying and executing system commands based on text within captured image data, according to disclosed embodiments. As described above, sensory unit 120 of apparatus 110 may capture image data that includes textual information. In some embodiments, processor 540 of apparatus 110 may execute one or more application modules to identify the textual information within the captured image data, retrieve machine-readable text from the identified textual information, and execute one or more system commands associated with the machine-readable text. Process 800 provides further details on how processor 540 identifies and executes one or more system commands based on text disposed within portions of captured image data.

In step 802, processor 540 may obtain captured image data. In some aspects, sensory unit 120 may capture one or more images, and the captured image data may be transmitted to processing unit 140 across wired or wireless communications link 130. As described above, processor 540 may obtain the captured image data in step 802 from sensory unit 120 across communications link 130, or alternatively, processor 540 may retrieve the captured image data from a corresponding data repository (e.g., image database 612 of memory 520). The captured image data may, in certain aspects, include at least one of textual information (e.g., printed text, handwritten text, displayed text, projected text, and coded text) and non-textual information (e.g., images of physical objects, persons, and triggers).

Processor 540 may analyze the captured image data in step 804 to identify portions of the captured image data that include the textual information. In one embodiment, as described herein, processor 540 may perform a layout analysis to detect the textual information within the captured image data. By way of example, the detected textual information may include, but is not limited to, paragraphs of text, blocks of text, zones and/or regions that include text, logos, titles, captions, footnotes, and any additional or alternate portions of the image data that include printed, handwritten, displayed, coded, and/or projected text.

In step 806, processor 540 may perform an OCR process on the detected textual information to identify and retrieve machine-readable text. Further, in step 808, processor 540 may identify one or more system commands associated with the recognized text based on linking information that associates the system commands with corresponding machine-readable text. In certain aspects, system command database 614 may store the linking information, and in step 808, processor 540 may access system command database 614 to obtain the linking information, as described above.
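
Because capitalization and punctuation in a captured image may differ from the stored linking information, the comparison in step 808 may normalize the machine-readable text before matching. The sketch below illustrates one such normalization; it is an assumption offered for illustration rather than a required implementation.

    import re


    def normalize(text):
        """Lower-case the text and collapse punctuation and extra whitespace."""
        text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
        return re.sub(r"\s+", " ", text).strip()


    def match_linked_commands(recognized_text, linking_information):
        """Step 808: return commands whose linked text appears in the OCR output."""
        normalized = normalize(recognized_text)
        return [entry["command"]
                for entry in linking_information
                if entry.get("text") and normalize(entry["text"]) in normalized]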

Processor 540 may execute the one or more identified system commands in step 810. Upon execution of the one or more identified system commands by processor 540, exemplary process 800 ends.

As described above, one or more of the system commands may correspond to an operation that modifies a functional state of apparatus 110 or an external device in communication with apparatus 110 (e.g., that causes apparatus 110 to enter a sleep mode, a training mode, or an airplane mode). One or more of the system commands may also correspond to processes performed by and supported by an operating system of apparatus 110, which include, but are not limited to, processes that initiate or terminate a recording of audio or video content, download a stored digital image (e.g., to image database 612), perform actions on one or more files associated with a user of apparatus 110, transmit a stored digital image to an external device in communication with apparatus 110, update or restart the operating system of apparatus 110, back up stored content, obtain information indicative of a status of a battery of apparatus 110, obtain audible instructions regarding one or more functions of apparatus 110, and establish, modify, or erase a user customization of apparatus 110 (e.g., a volume associated with apparatus 110 or a gender of an audible narration provided by apparatus 110). Further, and consistent with the disclosed embodiments, at least one of the system commands may be associated with a plurality of sequential steps, which correspond to system sub-commands executed sequentially by processor 540.

In the embodiments described above, processor 540 may identify system commands associated with one or more of textual information and non-textual information disposed within captured image data (e.g., step 706 of FIG. 7 and step 808 of FIG. 8), and may execute the identified system commands (e.g., in step 708 of FIG. 7 and step 810 of FIG. 8) without tactile or audible confirmation from a user. In some instances, however, the executed system commands may be associated with significant and often irreversible impacts on an operation of apparatus 110. For example, the executed system commands may erase a user-established customization of apparatus 110, or alternatively, delete or modify one or more image files stored by apparatus 110. In some embodiments, described below in reference to FIG. 9, processor 540 may execute the identified system commands in response to user input that confirms the user's intention to execute the identified system commands.

FIG. 9 is a flow diagram of an exemplary process 900 for executing system commands based on received user confirmation, according to disclosed embodiments. As described above, processor 540 may identify one or more system commands associated with at least one of textual or non-textual information disposed within captured image data. In some embodiments, processor 540 may execute the identified system commands in response to confirmation of the user's intention to execute the identified system commands. Process 900 provides further details on how processor 540 requests a confirmation of the user's intention to execute a system command, receives and processes input from the user, and executes the system command based on the received input.

In step 902, processor 540 may identify a system command associated with textual information, non-textual information, or combinations of textual and non-textual information within captured image data. For example, as described above in reference to FIGS. 7 and 8, the identified system command may be associated with one or more portions of machine-readable text retrieved from the textual information using a corresponding OCR process, and additionally or alternatively, may be associated with elements of non-textual information disposed within the captured image data.

Further, as described above, the identified system command may correspond to an operation that modifies a functional state of apparatus 110 or an external device in communication with apparatus 110 (e.g., that causes apparatus 110 to enter a sleep mode, a training mode, or an airplane mode). In other aspects, the identified system command may cause processor 540 to initiate or terminate a recording of audio or video content, download a stored digital image (e.g., to image database 612), modify one or more files associated with the user, transmit a stored digital image to an external device in communication with apparatus 110, perform an operation on a stored file, update or restart the operating system of apparatus 110, back up stored content, obtain information indicative of a status of a battery of apparatus 110, obtain audible instructions regarding one or more functions of apparatus 110, and establish, modify, or erase a user customization of apparatus 110 (e.g., a volume associated with apparatus 110 or a gender of an audible narration provided by apparatus 110). Further, and consistent with the disclosed embodiments, at least one of the system commands may be associated with a plurality of steps, which correspond to system sub-commands executed sequentially by processor 540.

Referring back to FIG. 9, processor 540 may request that the user confirm an intention to execute the identified system command in step 904. In one embodiment, processor 540 may generate an audible request, which may be presented to the user through a speaker or a bone conduction headphone associated with processing unit 140. The disclosed embodiments are, however, not limited to such audible requests, and in further embodiments, processor 540 may generate and provide a textual request to the user (e.g., by transmitting the textual request as a message to a mobile communications device of the user in communication with apparatus 110), a tactile request to the user (e.g., a vibration of apparatus 110 of a predetermined intensity and duration), or a request through any additional or alternate mechanism appropriate to the user and to apparatus 110.

In step 906, processor 540 may detect user input indicative of a response to the request for confirmation. In one embodiment, the detected user input may include an audible response to the request spoken by the user into a microphone associated with apparatus 110. Additionally or alternatively, the user input may include a tactile response to the request for confirmation (e.g., the user may tap a sensor or other input device disposed on a surface of apparatus 110). The disclosed embodiments are, however, not limited to such exemplary user input, and in other embodiments, the user input may include any additional or alternate form or combination of inputs appropriate to the user and to apparatus 110.

Based on the detected user input, processor 540 may determine in step 908 whether the user confirmed the intention to execute the identified system command. If processor 540 determines that the user confirms the execution (e.g., step 908; YES), processor 540 may execute the identified system command in step 910, as described above. Exemplary process 900 is then complete. Alternatively, if processor 540 determines that the user elects not to confirm the execution (e.g., step 908; NO), exemplary process 900 passes back to step 902, and processor 540 identifies an additional system command for execution, as described above.
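
A compact sketch of this confirmation loop is shown below; the ask_user and execute_command callables stand in for the audible, tactile, or textual request-and-response mechanisms described above and are hypothetical.

    def process_900(identified_commands, ask_user, execute_command):
        """Simplified sketch of process 900 (FIG. 9)."""
        for command in identified_commands:                  # step 902
            confirmed = ask_user(f"Execute {command}?")      # steps 904 and 906
            if confirmed:                                    # step 908: YES
                execute_command(command)                     # step 910
            # step 908: NO -> continue with the next identified command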

Using the embodiments described above, apparatus 110 may capture image data that includes one or more of textual and non-textual information, identify a system command that corresponds to the textual and/or non-textual information, and subsequently execute the identified system command to modify an operational state of apparatus 110. By way of example, a user of apparatus 110 may board an airplane and, upon locating a corresponding seat, browse through materials placed within a pocket or storage accessible to the user (e.g., a seat-back pocket). As illustrated in FIG. 10, the user may access an in-flight menu 1000 for the trip, and apparatus 110 may capture an image that includes a portion 1020 of menu 1000 corresponding to a field-of-view of sensory unit 120.

As described above, processor 540 may identify textual information within the captured image, may perform an OCR process that retrieves machine-readable text from the textual information, and may access system command database 614 to obtain linking information associating one or more system commands with corresponding portions of the recognized text. For example, as illustrated in FIG. 10, processor 540 may leverage the linking information to determine that text portion 1032 (e.g., “United™ Economy”), text portion 1034 (e.g., “Welcome Aboard!”), and text portion 1036 (e.g., “flights”) each correspond to a system command that causes apparatus 110 to enter an airplane mode. In certain embodiments, processor 540 may execute the corresponding system command, which places apparatus 110 into an airplane mode for the duration of the flight.
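
Expressed as linking information of the kind sketched earlier, the example of FIG. 10 might be represented as follows; the entries are hypothetical and shown only to illustrate that several distinct text portions may map to the same system command.

    AIRPLANE_MODE_LINKS = [
        {"text": "united economy", "command": "ENTER_AIRPLANE_MODE"},  # text portion 1032
        {"text": "welcome aboard", "command": "ENTER_AIRPLANE_MODE"},  # text portion 1034
        {"text": "flights",        "command": "ENTER_AIRPLANE_MODE"},  # text portion 1036
    ]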

The disclosed embodiments are, however, not limited to processes that identify system commands associated with machine-readable text, and in additional embodiments, processor 540 may identify one or more system commands associated with non-textual information disposed within the captured image data. For example, as illustrated in FIG. 11, a user of apparatus 110 may view an in-flight safety video 1100 after boarding an airplane, and apparatus 110 may capture an image that includes a portion 1120 of in-flight safety video 1100. Processor 540 may analyze portion 1120 to identify an image 1140 of a flight attendant demonstrating a proper technique for securing the user's seat belt, and may leverage linking information to determine that image 1140 corresponds to a system command that causes apparatus 110 to enter the airplane mode. As described above, processor 540 may execute the corresponding system command, which places apparatus 110 into an airplane mode during the flight.

In additional embodiments, described above, apparatus 110 may capture image data that includes textual and non-textual information projected onto a corresponding surface visible to a user of apparatus 110, and may identify and execute one or more system commands associated with the projected textual and/or non-textual information. By way of example, a user of apparatus 110 may visit a movie theater to view a recently released feature film. As illustrated in FIG. 12, the theater may project a reminder onto a screen 1200 asking viewers to turn off or silence their mobile communications devices, and apparatus 110 may capture an image that includes a portion 1220 of the reminder corresponding to a field-of-view of sensory unit 120.

Processor 540 may identify textual information within the captured image, may perform an OCR process that retrieves machine-readable text from the textual information, and may obtain linking information associating one or more system commands with corresponding portions of the recognized text. For example, as illustrated in FIG. 12, processor 540 may leverage the linking information to determine that text portion 1232 (e.g., “Turn Off Your Phones”) corresponds to a system command that causes apparatus 110 to enter a “sleep” or “silent” mode. In certain embodiments, processor 540 may execute the corresponding system command, which places apparatus 110 into the corresponding sleep or silent mode for the duration of the feature.

Apparatus 110 may also capture image data including textual and non-textual information displayed to a user of apparatus 110 through a display unit or touchscreen of a user device (e.g., a television, a smart phone, tablet computer, laptop, or desktop computer). For example, as illustrated in FIG. 13, the user may view a web page 1300 (or other electronic document, such as an email message or text message) that prompts the user to visit a corresponding “app store” and upgrade an operating system without cost, and apparatus 110 may capture an image that includes a portion 1320 of displayed web page 1300 corresponding to a field-of-view of sensory unit 120.

In certain aspects, processor 540 may identify textual information within the captured image, may perform an OCR process that retrieves machine-readable text from the textual information, and may obtain linking information associating one or more system commands with corresponding portions of the recognized text. For example, as shown in FIG. 13, processor 540 may leverage the linking information to determine that text portion 1332 (e.g., “Upgrade to the New OS”) corresponds to a system command that causes apparatus 110 to retrieve and install an update to an operating system of apparatus 110. Processor 540 may then execute the corresponding system command, which causes apparatus 110 to obtain and install the corresponding update, and further, to restart apparatus 110 to complete an installation process.

Further, in additional embodiments, apparatus 110 may capture image data that includes handwritten textual information, and may identify and execute one or more system commands associated with the handwritten textual information. For example, as illustrated in FIG. 14, a user of apparatus 110 may receive a letter 1400 from his or her mother that asks the user to provide copies of digital images by email. In such an instance, apparatus 110 may capture an image that includes a portion 1420 of letter 1400 corresponding to a field-of-view of sensory unit 120.

Processor 540 may analyze the captured image data to identify the handwritten textual information, perform an OCR process that retrieves machine-readable text from the handwritten textual information, and obtain linking information associating one or more system commands with corresponding portions of the handwritten text. Using the linking information, processor 540 may determine that text portion 1432 (e.g., “send copies of your new pictures to me”) corresponds to a system command that causes apparatus 110 to identify one or more stored digital images (e.g., within image database 612), and download the identified digital images to a user device in communication with apparatus 110 over a corresponding wired or wireless communications network.

Processor 540 may execute the corresponding system command, which causes apparatus 110 to download the identified images to the user's communications device. Further, in additional embodiments, the executed system command may also provide instructions to the user's communications device to automatically transmit the downloaded photos to the user's mother at email address 1434.

In additional embodiments, apparatus 110 may identify and execute system commands based on combinations of textual and non-textual information disposed within captured image data. For example, a user of apparatus 110 may approach an exterior exit of a building, but may be unaware of a street onto which the exit leads. In such an instance, the user may point to an exit sign disposed above the exit door, and apparatus 110 may capture an image that includes both the exit sign, with its corresponding textual information, and also an existence of the user's finger within the field-of-view of apparatus 110.

By way of example, as illustrated in FIG. 15, a captured image 1500 may include an image of an exit door 1510, textual information corresponding to an exit sign 1520 (e.g., “EXIT TO STREET”), and an image of a trigger 1530 (which corresponds to the user's finger). In certain aspects, processor 540 may identify the existence of the textual information and the non-textual information associated with trigger 1530, may identify a system command that corresponds to the presence of both the textual information (e.g., “EXIT TO STREET”) and trigger 1530 within the captured image data, and may execute the system command to provide positional information to the user of apparatus 110.

In an embodiment, upon execution of the identified system command, apparatus 110 may access a positioning system (e.g., a GPS unit) to obtain a current position of apparatus 110, and access a mapping system to identify the street onto which exit door 1510 leads. In certain aspects, the positioning and mapping systems may be executed by apparatus 110, or alternatively, may be executed by an external device in communication with apparatus 110 over a corresponding wired or wireless communications network. Processor 540 may then provide an audible indication of the determined street to the user of apparatus 110 (e.g., through a speaker or a bone conduction headphone associated with processing unit 140).
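
A minimal sketch of this positional command appears below; the positioning, mapping, and speech helpers are hypothetical stand-ins for services that may run on apparatus 110 or on an external device in communication with apparatus 110.

    def announce_exit_street(get_current_position, lookup_street, speak):
        """Sketch of the command identified from the exit sign and trigger of FIG. 15."""
        latitude, longitude = get_current_position()    # e.g., from a GPS unit
        street = lookup_street(latitude, longitude)     # e.g., from a mapping system
        speak(f"This exit leads to {street}.")          # audible indication to the user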

The foregoing description has been presented for purposes of illustration. It is not exhaustive and is not limited to the precise forms or embodiments disclosed. Modifications and adaptations will be apparent to those skilled in the art from consideration of the specification and practice of the disclosed embodiments. Additionally, although aspects of the disclosed embodiments are described as being stored in memory, one skilled in the art will appreciate that these aspects can also be stored on other types of computer-readable media, such as secondary storage devices, for example, hard disks, floppy disks, or CD-ROM, or other forms of RAM or ROM, USB media, DVD, or other optical drive media.

Computer programs based on the written description and disclosed methods are within the skill of an experienced developer. The various programs or program modules can be created using any of the techniques known to one skilled in the art or can be designed in connection with existing software. For example, program sections or program modules can be designed in or by means of .Net Framework, .Net Compact Framework (and related languages, such as Visual Basic, C, etc.), Java, C++, Objective-C, HTML, HTML/AJAX combinations, XML, or HTML with included Java applets. One or more of such software sections or modules can be integrated into a computer system or existing e-mail or browser software.

Moreover, while illustrative embodiments have been described herein, the scope of this disclosure includes any and all embodiments having equivalent elements, modifications, omissions, combinations (e.g., of aspects across various embodiments), adaptations and/or alterations as would be appreciated by those skilled in the art based on the present disclosure. The limitations in the claims are to be interpreted broadly based on the language employed in the claims and not limited to examples described in the present specification or during the prosecution of the application. The examples are to be construed as non-exclusive. Furthermore, the steps of the disclosed routines may be modified in any manner, including by reordering steps and/or inserting or deleting steps. It is intended, therefore, that the specification and examples be considered as illustrative only, with a true scope and spirit being indicated by the following claims and their full scope of equivalents.

1. An apparatus operated by at least one command retrieved from a captured image, the apparatus comprising: an image sensor configured to be worn by a user and to capture image data from an environment of the user; a mobile power source for powering at least the image sensor; and at least one portable processor device configured for tethering to the image sensor and configured to: identify human-readable text in the image data, the human-readable text representing a predefined system command; and execute the predefined system command represented by the human-readable text after the human-readable text is identified.
2. The apparatus of claim 1, wherein the image sensor is configured to be movable with a head of the user.
3. The apparatus of claim 1, wherein the tethering between the at least one processor device and the image sensor is based on a wired connection.
4. The apparatus of claim 1, wherein the tethering between the at least one processor device and the image sensor is based on a wireless connection.
5. The apparatus of claim 1, wherein the at least one portable processor device is further configured to: access a database of a plurality of system commands; identify a portion of the printed information that corresponds to the human-readable text, the printed information portion representing a corresponding one of the system commands; and establish the corresponding one of the system commands as the predefined system command.
6. The apparatus of claim 5, wherein the database is associated with a server having an Internet connection and remotely located with respect to the apparatus.
7. The apparatus of claim 1, wherein the at least one processor device is further configured to perform an optical character recognition on the image data to identify the human-readable text.
8. The apparatus of claim 5, wherein the at least one processor device is further configured to perform image processing on the image data to identify the human-readable text.
9. The apparatus of claim 1, wherein the at least one portable processor device is further configured to identify non-textual information representing the predefined system command within the image data.
10. The apparatus of claim 1, wherein the human-readable text includes hand-written information.
11. (canceled)
12. The apparatus of claim 5, wherein the at least one processor device is further configured to execute the predefined system command without tactile input from the user.
13. The apparatus of claim 1, wherein the at least one processor device is further configured to execute the predefined system command without audio input from the user.
14. The apparatus of claim 1, wherein the predefined system command includes a plurality of steps.
15. The apparatus of claim 1, wherein the predefined system command includes at least one of the following: enter training mode, enter sleep mode, enter airplane mode, start recording, end recording, download stored photo, backup content, update operating system, restart system, change device configuration, and erase customization.
16. The apparatus of claim 1, wherein the predefined system command includes performing an action on a particular file.
17. An apparatus operated by at least one command retrieved from a captured image, the apparatus comprising: an image sensor configured to be worn by a user and to capture image data from an environment of the user; and at least one portable processor device configured for tethering to the image sensor and configured to: receive the image data from the image sensor; identify human-readable text in the image data, the human-readable text representing a predefined system command; and execute the predefined system command represented by the human-readable text after the human-readable text is identified.
18. The apparatus of claim 17, wherein identifying the predefined system command includes performing optical character recognition on the image data, the optical character recognition being executed automatically upon receipt of the image data.
19. The apparatus of claim 17, wherein identifying the predefined system command includes performing image processing on the image data, the image processing being executed automatically upon receipt of the image data.
20. The apparatus of claim 17, wherein the at least one processor device is further configured to execute the predefined system command automatically after the predefined system command is identified.
21. The apparatus of claim 17, wherein the at least one processor device is further configured to execute the predefined system command after receiving an audible confirmation from the user.
22. A method for executing at least one command retrieved from a captured image, the method comprising: receiving image data from an image sensor; identifying human-readable text in the image data, the human-readable text representing a predefined system command; and executing the predefined system command after the human-readable text representing the predefined system command is identified.
23. A software product stored on a tangible non-transitory computer readable medium and comprising data and computer implementable instructions that, when executed by at least one processor, cause the at least one processor to perform a method, comprising: receiving image data from an image sensor; identifying human-readable text in the image data, the human-readable text representing a predefined system command; and executing the predefined system command represented by the human-readable text after the human-readable text is identified.